{
  "cells": [
    {
      "cell_type": "raw",
      "id": "afaf8039",
      "metadata": {
        "vscode": {
          "languageId": "raw"
        }
      },
      "source": [
        "---\n",
        "sidebar_label: OpenAI\n",
        "---"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "9a3d6f34",
      "metadata": {},
      "source": [
        "# OpenAI\n",
        "\n",
        "This will help you get started with OpenAIEmbeddings [embedding models](/docs/concepts/embedding_models) using LangChain. For detailed documentation on `OpenAIEmbeddings` features and configuration options, please refer to the [API reference](https://api.js.langchain.com/classes/langchain_openai.OpenAIEmbeddings.html).\n",
        "\n",
        "## Overview\n",
        "### Integration details\n",
        "\n",
        "| Class | Package | Local | [Py support](https://python.langchain.com/docs/integrations/text_embedding/openai/) | Package downloads | Package latest |\n",
        "| :--- | :--- | :---: | :---: |  :---: | :---: |\n",
        "| [OpenAIEmbeddings](https://api.js.langchain.com/classes/langchain_openai.OpenAIEmbeddings.html) | [@langchain/openai](https://api.js.langchain.com/modules/langchain_openai.html) | ❌ | ✅ | ![NPM - Downloads](https://img.shields.io/npm/dm/@langchain/openai?style=flat-square&label=%20&) | ![NPM - Version](https://img.shields.io/npm/v/@langchain/openai?style=flat-square&label=%20&) |\n",
        "\n",
        "## Setup\n",
        "\n",
        "To access OpenAIEmbeddings embedding models you'll need to create an OpenAI account, get an API key, and install the `@langchain/openai` integration package.\n",
        "\n",
        "### Credentials\n",
        "\n",
        "Head to [platform.openai.com](https://platform.openai.com) to sign up to OpenAI and generate an API key. Once you've done this set the `OPENAI_API_KEY` environment variable:\n",
        "\n",
        "```bash\n",
        "export OPENAI_API_KEY=\"your-api-key\"\n",
        "```\n",
        "\n",
        "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:\n",
        "\n",
        "```bash\n",
        "# export LANGSMITH_TRACING=\"true\"\n",
        "# export LANGSMITH_API_KEY=\"your-api-key\"\n",
        "```\n",
        "\n",
        "### Installation\n",
        "\n",
        "The LangChain OpenAIEmbeddings integration lives in the `@langchain/openai` package:\n",
        "\n",
        "```{=mdx}\n",
        "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n",
        "import Npm2Yarn from \"@theme/Npm2Yarn\";\n",
        "\n",
        "<IntegrationInstallTooltip></IntegrationInstallTooltip>\n",
        "\n",
        "<Npm2Yarn>\n",
        "  @langchain/openai @langchain/core\n",
        "</Npm2Yarn>\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "45dd1724",
      "metadata": {},
      "source": [
        "## Instantiation\n",
        "\n",
        "Now we can instantiate our model object and generate chat completions:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "id": "9ea7a09b",
      "metadata": {},
      "outputs": [],
      "source": [
        "import { OpenAIEmbeddings } from \"@langchain/openai\";\n",
        "\n",
        "const embeddings = new OpenAIEmbeddings({\n",
        "  apiKey: \"YOUR-API-KEY\", // In Node.js defaults to process.env.OPENAI_API_KEY\n",
        "  batchSize: 512, // Default value if omitted is 512. Max is 2048\n",
        "  model: \"text-embedding-3-large\",\n",
        "});"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "fb4153d3",
      "metadata": {},
      "source": [
        "If you're part of an organization, you can set `process.env.OPENAI_ORGANIZATION` to your OpenAI organization id, or pass it in as `organization` when\n",
        "initializing the model."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "77d271b6",
      "metadata": {},
      "source": [
        "## Indexing and Retrieval\n",
        "\n",
        "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).\n",
        "\n",
        "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document using the demo [`MemoryVectorStore`](/docs/integrations/vectorstores/memory)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "id": "d817716b",
      "metadata": {},
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "LangChain is the framework for building context-aware reasoning applications\n"
          ]
        }
      ],
      "source": [
        "// Create a vector store with a sample text\n",
        "import { MemoryVectorStore } from \"langchain/vectorstores/memory\";\n",
        "\n",
        "const text = \"LangChain is the framework for building context-aware reasoning applications\";\n",
        "\n",
        "const vectorstore = await MemoryVectorStore.fromDocuments(\n",
        "  [{ pageContent: text, metadata: {} }],\n",
        "  embeddings,\n",
        ");\n",
        "\n",
        "// Use the vector store as a retriever that returns a single document\n",
        "const retriever = vectorstore.asRetriever(1);\n",
        "\n",
        "// Retrieve the most similar text\n",
        "const retrievedDocuments = await retriever.invoke(\"What is LangChain?\");\n",
        "\n",
        "retrievedDocuments[0].pageContent;"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "e02b9855",
      "metadata": {},
      "source": [
        "## Direct Usage\n",
        "\n",
        "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embedDocument(...)` and `embeddings.embedQuery(...)` to create embeddings for the text(s) used in `fromDocuments` and the retriever's `invoke` operations, respectively.\n",
        "\n",
        "You can directly call these methods to get embeddings for your own use cases.\n",
        "\n",
        "### Embed single texts\n",
        "\n",
        "You can embed queries for search with `embedQuery`. This generates a vector representation specific to the query:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "id": "0d2befcd",
      "metadata": {},
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "[\n",
            "    -0.01927683,  0.0037708976,  -0.032942563,  0.0037671267,  0.008175306,\n",
            "   -0.012511838,  -0.009713832,   0.021403614,  -0.015377721, 0.0018684798,\n",
            "    0.020574018,   0.022399133,   -0.02322873,   -0.01524951,  -0.00504169,\n",
            "   -0.007375876,   -0.03448109, 0.00015130726,   0.021388533, -0.012564631,\n",
            "   -0.020031009,   0.027406884,  -0.039217334,    0.03036327,  0.030393435,\n",
            "   -0.021750538,   0.032610722,  -0.021162277,  -0.025898525,  0.018869571,\n",
            "    0.034179416,  -0.013371604,  0.0037652412,   -0.02146395, 0.0012641934,\n",
            "   -0.055688616,    0.05104287,  0.0024982197,  -0.019095825, 0.0037369595,\n",
            "  0.00088757504,   0.025189597,  -0.018779071,   0.024978427,  0.016833287,\n",
            "  -0.0025868358,  -0.011727491, -0.0021154736,  -0.017738303, 0.0013839195,\n",
            "  -0.0131151825,   -0.05405959,   0.029729757,  -0.003393808,  0.019774588,\n",
            "    0.028885076,   0.004355387,   0.026094612,    0.06479911,  0.038040817,\n",
            "    -0.03478276,  -0.012594799,  -0.024767255, -0.0031430433,  0.017874055,\n",
            "   -0.015294761,   0.005709139,   0.025355516,   0.044798266,   0.02549127,\n",
            "    -0.02524993, 0.00014553308,  -0.019427665,  -0.023545485,  0.008748483,\n",
            "    0.019850006,  -0.028417485,  -0.001860938,   -0.02318348, -0.010799851,\n",
            "     0.04793565, -0.0048983963,    0.02193154,  -0.026411368,  0.026426451,\n",
            "   -0.012149832,   0.035355937,  -0.047814984,  -0.027165547, -0.008228099,\n",
            "   -0.007737882,   0.023726488,  -0.046487626,  -0.007783133, -0.019638835,\n",
            "     0.01793439,  -0.018024892,  0.0030336871,  -0.019578502, 0.0042837397\n",
            "]\n"
          ]
        }
      ],
      "source": [
        "const singleVector = await embeddings.embedQuery(text);\n",
        "\n",
        "console.log(singleVector.slice(0, 100));"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "1b5a7d03",
      "metadata": {},
      "source": [
        "### Embed multiple texts\n",
        "\n",
        "You can embed multiple texts for indexing with `embedDocuments`. The internals used for this method may (but do not have to) differ from embedding queries:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "id": "2f4d6e97",
      "metadata": {},
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "[\n",
            "    -0.01927683,  0.0037708976,  -0.032942563,  0.0037671267,  0.008175306,\n",
            "   -0.012511838,  -0.009713832,   0.021403614,  -0.015377721, 0.0018684798,\n",
            "    0.020574018,   0.022399133,   -0.02322873,   -0.01524951,  -0.00504169,\n",
            "   -0.007375876,   -0.03448109, 0.00015130726,   0.021388533, -0.012564631,\n",
            "   -0.020031009,   0.027406884,  -0.039217334,    0.03036327,  0.030393435,\n",
            "   -0.021750538,   0.032610722,  -0.021162277,  -0.025898525,  0.018869571,\n",
            "    0.034179416,  -0.013371604,  0.0037652412,   -0.02146395, 0.0012641934,\n",
            "   -0.055688616,    0.05104287,  0.0024982197,  -0.019095825, 0.0037369595,\n",
            "  0.00088757504,   0.025189597,  -0.018779071,   0.024978427,  0.016833287,\n",
            "  -0.0025868358,  -0.011727491, -0.0021154736,  -0.017738303, 0.0013839195,\n",
            "  -0.0131151825,   -0.05405959,   0.029729757,  -0.003393808,  0.019774588,\n",
            "    0.028885076,   0.004355387,   0.026094612,    0.06479911,  0.038040817,\n",
            "    -0.03478276,  -0.012594799,  -0.024767255, -0.0031430433,  0.017874055,\n",
            "   -0.015294761,   0.005709139,   0.025355516,   0.044798266,   0.02549127,\n",
            "    -0.02524993, 0.00014553308,  -0.019427665,  -0.023545485,  0.008748483,\n",
            "    0.019850006,  -0.028417485,  -0.001860938,   -0.02318348, -0.010799851,\n",
            "     0.04793565, -0.0048983963,    0.02193154,  -0.026411368,  0.026426451,\n",
            "   -0.012149832,   0.035355937,  -0.047814984,  -0.027165547, -0.008228099,\n",
            "   -0.007737882,   0.023726488,  -0.046487626,  -0.007783133, -0.019638835,\n",
            "     0.01793439,  -0.018024892,  0.0030336871,  -0.019578502, 0.0042837397\n",
            "]\n",
            "[\n",
            "   -0.010181213,   0.023419594,   -0.04215527, -0.0015320902,  -0.023573855,\n",
            "  -0.0091644935,  -0.014893179,   0.019016149,  -0.023475688,  0.0010219777,\n",
            "    0.009255648,    0.03996757,   -0.04366983,   -0.01640774,  -0.020194141,\n",
            "    0.019408813,  -0.027977299,  -0.022017224,   0.013539891,  -0.007769135,\n",
            "    0.032647192,  -0.015089511,  -0.022900717,   0.023798235,   0.026084099,\n",
            "   -0.024625633,   0.035003178,  -0.017978394,  -0.049615882,   0.013364594,\n",
            "    0.031132633,   0.019142363,   0.023195215,  -0.038396914,   0.005584942,\n",
            "   -0.031946007,   0.053682756, -0.0036356465,   0.011240003,  0.0056690844,\n",
            "  -0.0062791156,   0.044146635,  -0.037387207,    0.01300699,   0.018946031,\n",
            "   0.0050415234,   0.029618073,  -0.021750772,  -0.000649473, 0.00026951815,\n",
            "   -0.014710871,  -0.029814405,    0.04204308,  -0.014710871,  0.0039616977,\n",
            "   -0.021512369,   0.054608323,   0.021484323,    0.02790718,  -0.010573876,\n",
            "   -0.023952495,  -0.035143413,  -0.048802506, -0.0075798146,   0.023279356,\n",
            "   -0.022690361,  -0.016590048,  0.0060477243,   0.014100839,   0.005476258,\n",
            "   -0.017221114, -0.0100059165,  -0.017922299,  -0.021989176,    0.01830094,\n",
            "     0.05516927,   0.001033372,  0.0017310516,   -0.00960624,  -0.037864015,\n",
            "    0.013063084,   0.006591143,  -0.010160177,  0.0011394264,    0.04953174,\n",
            "    0.004806626,   0.029421741,  -0.037751824,   0.003618117,   0.007162609,\n",
            "    0.027696826, -0.0021070621,  -0.024485396, -0.0042141243,   -0.02801937,\n",
            "   -0.019605145,   0.016281527,  -0.035143413,    0.01640774,   0.042323552\n",
            "]\n"
          ]
        }
      ],
      "source": [
        "const text2 = \"LangGraph is a library for building stateful, multi-actor applications with LLMs\";\n",
        "\n",
        "const vectors = await embeddings.embedDocuments([text, text2]);\n",
        "\n",
        "console.log(vectors[0].slice(0, 100));\n",
        "console.log(vectors[1].slice(0, 100));"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2b1a3527",
      "metadata": {},
      "source": [
        "## Specifying dimensions\n",
        "\n",
        "With the `text-embedding-3` class of models, you can specify the size of the embeddings you want returned. For example by default `text-embedding-3-large` returns embeddings of dimension 3072:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "a611fe1a",
      "metadata": {},
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "3072\n"
          ]
        }
      ],
      "source": [
        "import { OpenAIEmbeddings } from \"@langchain/openai\";\n",
        "\n",
        "const embeddingsDefaultDimensions = new OpenAIEmbeddings({\n",
        "  model: \"text-embedding-3-large\",\n",
        "});\n",
        "\n",
        "const vectorsDefaultDimensions = await embeddingsDefaultDimensions.embedDocuments([\"some text\"]);\n",
        "console.log(vectorsDefaultDimensions[0].length);"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "08efe771",
      "metadata": {},
      "source": [
        "But by passing in `dimensions: 1024` we can reduce the size of our embeddings to 1024:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "id": "19667fdb",
      "metadata": {},
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "1024\n"
          ]
        }
      ],
      "source": [
        "import { OpenAIEmbeddings } from \"@langchain/openai\";\n",
        "\n",
        "const embeddings1024 = new OpenAIEmbeddings({\n",
        "  model: \"text-embedding-3-large\",\n",
        "  dimensions: 1024,\n",
        "});\n",
        "\n",
        "const vectors1024 = await embeddings1024.embedDocuments([\"some text\"]);\n",
        "console.log(vectors1024[0].length);"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "6b84c0df",
      "metadata": {},
      "source": [
        "## Custom URLs\n",
        "\n",
        "You can customize the base URL the SDK sends requests to by passing a `configuration` parameter like this:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "3bfa20a6",
      "metadata": {},
      "outputs": [],
      "source": [
        "import { OpenAIEmbeddings } from \"@langchain/openai\";\n",
        "\n",
        "const model = new OpenAIEmbeddings({\n",
        "  configuration: {\n",
        "    baseURL: \"https://your_custom_url.com\",\n",
        "  },\n",
        "});"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "ac3cac9b",
      "metadata": {},
      "source": [
        "You can also pass other `ClientOptions` parameters accepted by the official SDK.\n",
        "\n",
        "If you are hosting on Azure OpenAI, see the [dedicated page instead](/docs/integrations/text_embedding/azure_openai)."
      ]
    },
    {
      "cell_type": "markdown",
      "id": "8938e581",
      "metadata": {},
      "source": [
        "## API reference\n",
        "\n",
        "For detailed documentation of all OpenAIEmbeddings features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain_openai.OpenAIEmbeddings.html"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "TypeScript",
      "language": "typescript",
      "name": "tslab"
    },
    "language_info": {
      "codemirror_mode": {
        "mode": "typescript",
        "name": "javascript",
        "typescript": true
      },
      "file_extension": ".ts",
      "mimetype": "text/typescript",
      "name": "typescript",
      "version": "3.7.2"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}